ABSTRACT

We extend our previous physically-based halo occupation distribution models to
include the dependence of clustering on the spectral energy distributions of galaxies.
The high resolution Millennium Simulation is used to specify the positions and the
velocities of the model galaxies. The stellar mass of a galaxy is assumed to depend
only on M_infall, the halo mass when the galaxy was last the central dominant object
of its halo. Star formation histories are parametrized using two additional quantities
that are measured from the simulation for each galaxy: its formation time (t_form), and
the time when it first becomes a satellite (t_infall). Central galaxies begin forming stars
at time t_form with an exponential time scale τ_c. If the galaxy becomes a satellite, its
star formation declines thereafter with a new time scale τ_s. We compute 4000 Å break
strengths for our model galaxies using stellar population synthesis models. By fitting
these models to the observed abundances and projected correlations of galaxies as a
function of break strength in the Sloan Digital Sky Survey, we constrain τ_c and τ_s as
functions of galaxy stellar mass. We find that central galaxies with large stellar masses
have ceased forming stars. At low stellar masses, central galaxies display a wide range
of different star formation histories, with a significant fraction experiencing recent
starbursts. Satellite galaxies of all masses have declining star formation rates, with
similar e-folding times, τ_s ~ 2.5 Gyr. One consequence of this long e-folding time is
that the colour-density relation is predicted to flatten at redshifts > 1.5, because star
formation in the majority of satellites has not yet declined by a significant factor. This
is consistent with recent observational results from the DEEP and VVDS surveys.

Key words: galaxies: fundamental parameters - galaxies: haloes - galaxies: distances
and redshifts - cosmology: theory - cosmology: dark matter - cosmology: large-scale
structure

1 INTRODUCTION

The most fundamental metric of a galaxy is its luminosity, 
which serves as a rough indicator of its total mass. Another 
fundamental metric of a galaxy is its colour or spectral type, 
which is usually interpreted as an indicator of its recent star 
formation history (although metallicity and dust are both 
known to affect galaxy colours). In the local Universe, galaxy 
luminosities and colours are known to be strongly correlated; 
luminous elliptical galaxies are much redder than the less 
luminous spirals. It has also long been known that the clustering
of galaxies is a strong function of morphological type
[...] luminous galaxies being more strongly clustered than fainter
ones.

Recent large surveys such as 2dFGRS (Colless et al. 2001)
and SDSS (York et al. 2000) have allowed the covariance
between galaxy luminosity and colour to be broken.
Norberg et al. (2002) showed that at all luminosities, galaxies
with spectral features indicative of a "passive" old stellar
population have a higher correlation amplitude than galaxies
with ongoing star formation. Likewise, Zehavi et al. (2005)
and Li et al. (2006a) measured the clustering of red and
blue galaxies as a function of luminosity and stellar mass,
and showed that red galaxies are more strongly correlated
and have a correlation function with a steeper slope. The
strongest differences between the red and blue correlation
functions occurred for galaxies with the lowest luminosities
and stellar masses.

In order to interpret these results, we need to under- 
stand the underlying physics that causes the dependence 
of the correlation function on colour/spectral type. Semi- 
analytic models of galaxy formation are able to provide 
considerable insight into the processes that determine how
galaxies with different properties cluster (Kauffmann et al.
1997, 1999; Benson et al. 2000; Croton et al. 2006). These
models use N-body simulations of the dark matter to specify 
the locations of galaxies, and they invoke simple prescrip- 
tions to describe processes such as gas cooling, star forma- 
tion and supernova feedback. 

In the semi-analytic models (SAMs), there are two rea- 
sons why galaxies transition from star-forming (blue) to pas- 
sive (red) systems. One is a consequence of the infall of a 
galaxy onto a larger halo. When this occurs, the galaxy is 
stripped of its supply of infalling gas and its star forma- 
tion rate declines as its cold gas reservoir is depleted. This 
means that there will be a population of red satellite galaxies
located in groups and clusters (Diaferio et al. 2001). However,
this process by itself is not sufficient to explain the
very strong observed dependence of colour on galaxy lu- 
minosity. Some other process must act to terminate star 
formation in the central galaxies of massive dark matter 
haloes. In recent models (Croton et al. 2006; Bower et al.
2006; Cattaneo et al. 2006), feedback from active galactic
nuclei (AGN) has been invoked as a possible mechanism for 
suppressing star formation in massive central galaxies. How- 
ever, the details of the AGN feedback process and the trun- 
cation of star formation in infalling satellite galaxies remain 
poorly constrained, so in reality there is still considerable 
freedom when attempting to specify the star formation his- 
tories of galaxies in these models. The colour distributions 
generated by the SAMs should thus be treated as indicative 
rather than quantitative predictions of the models. 

The Halo Occupation Distribution (HOD) approach bypasses
any consideration of the physical processes important
in galaxy formation. It specifies how galaxies are related
to dark matter halos in a purely statistical fashion.
van den Bosch et al. (2003) were the first to model the clustering
properties of early and late-type galaxies in the context
of HOD models by using observational data from the
2dF redshift survey to constrain the average number of early
and late-type galaxies of given luminosity residing in a halo
of given mass. More recently, Zehavi et al. (2005) built HOD
models that were able to reproduce the correlation function 
of red and blue galaxies as a function of luminosity, by defin- 
ing a blue galaxy fraction as a function of dark halo mass. 
The blue fraction also depended on whether the galaxy was 
a central or a satellite system. The results of both studies 
demonstrate that the strong clustering of faint red galaxies 
can be explained if nearly all of them are satellite systems in 
high-mass halos. These results have recently been extended
to higher redshifts by Phleps et al. (2006). The main disadvantage
of the HOD approach is that it remains unclear how
one progresses from a purely statistical characterization of 
the link between galaxies and dark matter halos, to a more 
physical understanding of the galaxy formation process it- 
self. 

In previous work (Wang et al. 2006), we attempted to
build a physically-based HOD model that would combine
the advantages of the statistical HOD models with those of
the semi-analytic approach. A large high resolution N-body
simulation, the Millennium Simulation (Springel et al. 2005),
was used to follow the merging paths of dark matter halos
and their associated substructures and to specify the positions
and the velocities of the galaxies in the simulation box.
The properties of the galaxies (in this case, their luminosities
and stellar masses) were specified using simple parameterized
functions. Wang et al. (2006) chose to parametrize
the luminosities and masses of the galaxies in their models
in terms of the quantity M_infall, which was defined as
the mass of the dark matter halo when the galaxy was last
the central galaxy of its own halo. Wang et al. (2006) tested
this parametrization using the semi-analytic galaxy catalogues
of Croton et al. (2006) and then applied it to
data from the Sloan Digital Sky Survey. They were able to
show that the relation between stellar mass and halo mass
inferred from the combination of the models and the clustering
data was in good agreement with independent measurements
of this relation using galaxy-galaxy lensing techniques
(Mandelbaum et al. 2006).

In this paper, we extend our method to model the dependence
of clustering on the spectral energy distributions
of galaxies. For galaxies of given stellar mass, we assume the
star formation history to depend not only on stellar mass,
but also on whether the galaxy is a central object or a satellite
in the simulation. We fit this model to the colour distributions
of galaxies of given stellar mass, as well as to their correlation
functions split by colour. We choose to focus on the
spectral index D4000 rather than a more traditional broad-band
colour. The D4000 index is defined as the ratio of the
flux in two bands on the long- and short-wavelength sides
of the 4000 Å discontinuity. This 4000 Å break arises from
the sum of many absorption lines produced by ionized metals
in the atmospheres of stars. Because the absorption increases
with decreasing stellar temperature, the D4000 break gets
larger with older ages, and is largest for old and metal-rich
stellar populations. Therefore it is a good indicator of the
star formation history of a galaxy. In this work we adopt
the narrow definition of the 4000 Å break introduced by
Balogh et al. (1999), and denote it as D_n4000. Galaxies with
large and small values of D_n4000 are referred to as "old" and
"young" respectively. Because D_n4000 is defined in a narrow
wavelength interval, it is insensitive to the effects of dust.

The parametrized models that we construct extend the
old ones, in which stellar mass M_* is assumed to depend
only on M_infall, by assuming that the star formation rate
of each galaxy declines exponentially after its first appearance
in the simulation with time constant τ_c(M_*) as long
as it is a central galaxy, and then with a different time constant
τ_s(M_*) after it becomes a satellite galaxy. Note that
this strongly resembles the approach adopted in the semi-analytic
models. The main difference is that the time scales
τ_c and τ_s and their dependence on mass are not specified
using any fixed "recipe"; we allow these time scales to be directly
constrained by the observational data. As we will see,
this approach allows us to draw interesting physical conclusions
from a comparison of our models with the SDSS data.

In Sec. 2 we present observational results from the
Sloan Digital Sky Survey (SDSS). These include the
D_n4000/colour distributions of galaxies in different stellar
mass ranges and the projected two-point correlation functions
of red/blue and old/young galaxies. In Sec. 3 we outline
our basic theoretical concepts and describe how we decided
to model the star formation histories of the galaxies, and
calculate their spectral energy distributions. In Sec. 4 we
describe how we fit the models to the data. Interpretation
and tests of our results are given in Sec. 5. Conclusions and
discussions are presented in the final section.



2 THE OBSERVATIONAL RESULTS 

Our observational results are based on a sample of ~200,000
galaxies drawn from the SDSS Data Release Two. The
galaxies have 0.01 < z < 0.3, 14.5 < r < 17.77 and
-23 < M_0.1r < -16, where r is the r-band Petrosian apparent
magnitude corrected for foreground extinction, and
M_0.1r is the r-band absolute magnitude corrected to redshift
z = 0.1. This sample has formed the basis of our previous
investigations of the correlation function, the power spectrum,
pairwise velocity dispersion distributions, and the luminosity
and the stellar mass function of galaxies (Li et al.
2006a,b). In Wang et al. (2006), we made use of the measurements
of the projected correlation function for galaxies
in bins of luminosity and stellar mass to constrain the relation
between galaxy luminosity/stellar mass and M_infall.
In this paper, we extend this analysis to the spectral energy
distributions of galaxies. In this section, we focus on the distributions
of D_n4000 and colour and we explore how the
projected two-point correlation functions split by D_n4000
and colour change as a function of M_*.

The galaxies in our sample are divided into four subsamples
according to their stellar mass. Each of the stellar
mass subsamples is then divided into two further subsamples
according to D_n4000 and g - r, using a method similar to
that adopted in Li et al. (2006a). We fit bi-Gaussian functions
to the distributions of D_n4000 and g - r as a function of
stellar mass. The division into high D_n4000 and low D_n4000,
red and blue, is defined to be the mean of the two Gaussian
centres in each stellar mass bin. In the computation of the
D_n4000 and g - r distributions, we have corrected for incompleteness
by weighting each galaxy by a factor of V_survey/V_max,
where V_survey is the volume of the sample, and V_max is the maximum
volume over which the galaxy could be observed within
the sample redshift range and within the range of r-band apparent
magnitudes.
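
The weighting and the old/young division described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the input lists and the bin edges are hypothetical, and the bi-Gaussian fit itself is taken as given (only its two fitted centres are used).

```python
def vmax_weighted_histogram(values, v_max, v_survey, edges):
    """Normalised histogram of `values`, each galaxy weighted by V_survey / V_max.

    values   : D_n4000 (or g - r) of each galaxy (hypothetical input list)
    v_max    : maximum volume over which each galaxy could be observed
    v_survey : total volume of the sample
    edges    : increasing bin edges
    """
    counts = [0.0] * (len(edges) - 1)
    for x, vm in zip(values, v_max):
        for i in range(len(edges) - 1):
            if edges[i] <= x < edges[i + 1]:
                counts[i] += v_survey / vm  # incompleteness correction
                break
    total = sum(counts)
    return [c / total for c in counts] if total > 0 else counts


def division_point(centre_low, centre_high):
    """Dividing value: the mean of the two fitted Gaussian centres."""
    return 0.5 * (centre_low + centre_high)
```

A galaxy that is visible over only half of the survey volume counts twice in the weighted distribution, which is how faint objects near the magnitude limit are compensated.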

For each subsample, the redshift-space two-point correlation
function (2PCF) ξ^s(r_p, π) is measured using the
Hamilton (1993) estimator. The projected 2PCF w(r_p) is
then estimated by integrating ξ^s(r_p, π) along the line-of-sight
direction π, with |π| ranging from 0 to 40 h^-1 Mpc.
When computing 2PCFs, two different methods for constructing
random samples have been used: the "standard"
method in which the redshift selection function is explicitly
modelled using the luminosity function, and the more
"general" method in which only the sky positions of the
observed galaxies are randomly re-assigned. As shown in
Li et al. (2006a), the 2PCFs obtained with the two methods
are in good agreement, and here we will use the more
"general" method. We have also carefully corrected for possible
biases, such as the variance in mass-to-light ratio and
the small-scale deficiency in the 2PCF due to fibre collisions
(Li et al. 2006a). This ensures accurate measurements of the
correlation functions on small scales.
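
As a minimal sketch of the two steps in this measurement, the Hamilton (1993) estimator can be written as ξ = DD·RR/DR² - 1 for normalised pair counts, and the projection as a sum over line-of-sight bins. The function names and the binning are hypothetical, and the factor of 2 (covering both signs of π when only |π| bins are stored) is an assumed convention rather than something stated in the text.

```python
def hamilton_xi(dd, dr, rr):
    """Hamilton (1993) estimator from normalised pair counts in one (r_p, pi) bin.

    dd, dr, rr : data-data, data-random and random-random pair counts,
                 each already normalised by the total number of pairs.
    """
    return dd * rr / (dr * dr) - 1.0


def projected_wp(xi_row, pi_edges):
    """Projected 2PCF at one r_p: w(r_p) = 2 * sum_i xi(r_p, pi_i) * dpi_i.

    xi_row   : xi(r_p, pi_i) in bins of |pi|
    pi_edges : edges of the |pi| bins, e.g. from 0 to 40 (h^-1 Mpc)
    """
    return 2.0 * sum(
        xi * (pi_edges[i + 1] - pi_edges[i]) for i, xi in enumerate(xi_row)
    )
```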

To take into account the effect of "cosmic variance"
on the w(r_p) measurements, we have constructed a set of
10 mock galaxy catalogues from the Millennium Simulation
with exactly the same geometry and selection function as
the real sample. The effect of cosmic variance is modelled
by placing a virtual observer randomly inside the simulation
box when constructing these mock catalogues. The detailed
procedure for constructing these mock catalogues has been
presented in Li et al. (2006c). For each mock catalogue, we
divide galaxies into subsamples according to stellar mass and
g - r colour, in the same way as for the real sample, and
we measure w(r_p) for these subsamples. The 1σ variation
between these mock catalogues is then added in quadrature
to the bootstrap errors. Note that the errors are assumed to
be the same for the splits by g - r and by D_n4000.
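
The error budget described above can be illustrated as follows. This is a sketch only: the function names are hypothetical, and the use of the sample standard deviation across mocks is an assumption about how the 1σ variation is defined.

```python
import math
import statistics


def mock_scatter(wp_mocks):
    """1-sigma scatter of w(r_p) between mock catalogues, per r_p bin.

    wp_mocks : list of w(r_p) measurements, one list per mock catalogue.
    """
    n_bins = len(wp_mocks[0])
    return [statistics.stdev([mock[i] for mock in wp_mocks]) for i in range(n_bins)]


def total_error(bootstrap_err, mock_err):
    """Add the mock-to-mock scatter in quadrature to the bootstrap errors."""
    return [math.hypot(b, m) for b, m in zip(bootstrap_err, mock_err)]
```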

Fig. 1 shows the distributions of D_n4000 and g - r colour
for galaxies, as well as the projected 2PCF w(r_p) for the
"red" and "blue" subsamples in four stellar mass ranges.
As can be seen, in each stellar mass interval, D_n4000/g - r
shows a bimodal distribution, with the fraction of galaxies in
the red peak increasing towards higher masses. Older/redder
galaxies of all stellar masses are more strongly clustered and
have steeper correlation functions than their younger/bluer
counterparts. This age/colour dependence is much stronger
for the low mass galaxies than for the high mass galaxies,
particularly on small scales. Notice that the clustering results for
subsamples split by D_n4000 and g - r are quite similar, but
the distribution functions of these two quantities are quite
different, particularly at low masses. For example, for the
lowest mass bin (9.5 < log(M_*/M_⊙) < 10), the fractions of
red and blue galaxies are comparable, but the fraction of
galaxies with large D_n4000 values is much smaller than that
of the low D_n4000 population. When building our model,
we will first focus on the D_n4000 spectral index, and test to
what extent a model that reproduces the observed trends as
a function of D_n4000 will also work for g - r colour.



3 THEORETICAL CONCEPTS 

In the paper of Wang et al. (2006), we used the Millennium
Simulation to construct a model to describe the clustering
of galaxies as a function of their luminosities and stellar
masses. The positions and velocities of the galaxies in the
simulation box were obtained by following the orbits and
merging histories of the substructures in the simulation.
Parametrized functions were then adopted to relate the luminosities
and stellar masses of the galaxies to the quantity
M_infall, defined as the mass of the halo at the epoch when
the galaxy was last the central dominant object in its own
halo. By fitting both the stellar mass function and the projected
correlation function w(r_p) measured in five different
stellar mass bins, we were able to use the SDSS data to constrain
the link between galaxy stellar mass M_* and M_infall.
This relation was parametrized using a double power-law of
the form:

\[
M_* = \frac{2}{\left(M_{\rm infall}/M_0\right)^{-\alpha} + \left(M_{\rm infall}/M_0\right)^{-\beta}} \times k .
\]

The scatter in log(M_*) at a given value of M_infall was described
using a Gaussian function with width σ. Our best-fit
model to the SDSS data had the following parameters:
M_0 = 4.0 × 10^11 h^-1 M_⊙, α = 0.29, β = 2.42, log k = 10.35 and
σ = 0.203 for central galaxies, and M_0 = 4.32 × 10^11 h^-1 M_⊙,
α = 0.232, β = 2.49, log k = 10.24 and σ = 0.291 for satellite
galaxies.¹

Figure 1. Observational results from SDSS in bins of stellar mass. The left two columns show D_n4000 distributions and correlation
functions split by D_n4000. Red/blue lines represent subsamples with larger/smaller values of D_n4000. The right two columns are for
g - r; the red and blue lines here represent clustering of red and blue subsamples.



1 Note that these parameters are slightly different from those published
in the paper of Wang et al. (2006), because there was
a small change in the definition of "first progenitor" when
building the dark matter subhalo tree in the simulation (see
De Lucia & Blaizot (2007) for more details about this modification).
Nevertheless, the best-fit relation is almost the same as
previously obtained, and all the conclusions of the paper remain
unchanged.
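
To make the parametrization concrete, the sketch below evaluates a double power-law with the best-fit parameters quoted in the text and draws a stellar mass with Gaussian scatter in log(M_*). It assumes the relation takes the form M_* = 2k / [(M_infall/M_0)^-α + (M_infall/M_0)^-β]; the function names and the dictionary layout are hypothetical, and masses are taken to be in the same units as M_0 and k.

```python
import math
import random

# Best-fit parameters quoted in the text (M_0 in h^-1 M_sun).
CENTRAL = {"m0": 4.0e11, "alpha": 0.29, "beta": 2.42, "log_k": 10.35, "sigma": 0.203}
SATELLITE = {"m0": 4.32e11, "alpha": 0.232, "beta": 2.49, "log_k": 10.24, "sigma": 0.291}


def mean_stellar_mass(m_infall, p):
    """Double power-law M_*(M_infall); the functional form is an assumption."""
    x = m_infall / p["m0"]
    return 2.0 * 10.0 ** p["log_k"] / (x ** (-p["alpha"]) + x ** (-p["beta"]))


def draw_stellar_mass(m_infall, p, rng):
    """Draw M_* with Gaussian scatter of width sigma in log10(M_*)."""
    log_mean = math.log10(mean_stellar_mass(m_infall, p))
    return 10.0 ** rng.gauss(log_mean, p["sigma"])
```

With this form the relation passes through M_* = k at M_infall = M_0, rises steeply (slope β) below M_0 and flattens (slope α) above it. A call such as `draw_stellar_mass(1e12, CENTRAL, random.Random(1))` returns one realisation for a central galaxy.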



In this paper, our aim is to extend this model to describe
not only the masses and luminosities of galaxies, but also
their colours and spectral energy distributions. A successful
model should be able to reproduce the 4000 Å break strength
distribution at each stellar mass, as well as the correlation
functions split by D_n4000 shown in Fig. 1.

In standard semi-analytic models, galaxies that reside
at the centre of their dark matter halo are called
central galaxies, and those that have been accreted into
larger structures are termed satellite galaxies. The simplest
model one could imagine for differentiating galaxies in the
high/low D_n4000 or red/blue peaks of the bimodal distributions
shown in Fig. 1 would be that these two peaks correspond
to these satellite and central galaxy populations.
Central galaxies have young stellar populations and more
active star formation because of ongoing cooling and gas
accretion. Satellite galaxies are older because they run out
of gas after they are accreted or their gas is removed by
processes such as ram pressure stripping. In early semi-analytic
models (Kauffmann et al. 1993; Cole et al. 1994;
Somerville & Primack 1999), this was indeed the case; red
galaxies were mainly satellite galaxies and blue galaxies were
the central galaxies of their own haloes.

Fig. 2 shows that this picture does not fit the SDSS
observations. In the left panel of Fig. 2, blue and red lines
show the fraction of central and satellite galaxies as a function
of stellar mass in the model of Wang et al. (2006). For
comparison, diamonds show the fraction of galaxies in the
high D_n4000 peak as a function of M_*, as measured from the
SDSS data. As can be seen, the fraction of satellite galaxies
in the simulation does not match the fraction of "old" galaxies
in the real Universe, except at stellar masses ~ 10^10 M_⊙.
At lower stellar masses, the fraction of satellite galaxies is
higher than the fraction of old galaxies, implying that some
low mass satellite systems are still forming stars. At high
stellar masses, nearly all galaxies in the real Universe are
"old". Fig. 2 shows that the majority of these massive old
galaxies must be central galaxies.

In the right panel of Fig. 2, blue and red lines show
the projected two-point correlation functions for central and
satellite galaxies in the simulations in 4 different stellar mass
ranges. Results from the SDSS for the low and high D_n4000
subsamples are plotted in black. This again shows the failure
of a simple central/satellite dichotomy as a way of explaining
the difference between the "old" and "young" galaxy
populations in the SDSS. As can be seen, the difference in
clustering amplitude between the two galaxy populations
decreases as a function of increasing stellar mass, while the
difference in clustering strength between central galaxies and
satellite galaxies remains approximately constant.

Figure 2. Left panel: fractions of central (blue) and satellite (red) systems in bins of stellar mass in the models of Wang et al. (2006).
Black symbols show the fraction of galaxies in the high D_n4000 peak from SDSS. Right panel: the clustering of central (blue) and satellite
(red) galaxies in the model compared with low/high D_n4000 subsamples in the SDSS (upper and lower black curves).

Figure 3. Distribution of infall times for satellite galaxies in
different stellar mass bins. The x-axis shows time from past to
present, with 13.7 Gyr corresponding to the present day.


It is important to remember that satellite galaxies were
not all "created" at the same time. Fig. 3 shows distributions
of the times at which satellite galaxies of different stellar
masses were first accreted by a larger structure. The satellite
infall (i.e. accretion) times t_infall are randomly distributed
between the two simulation snapshots when the galaxy first
transitions from being a central object in its own halo to a
satellite system. As can be seen, high mass satellites have
on average been accreted more recently than low mass satellites.
This effect goes in the wrong direction to resolve the
discrepancies shown in Fig. 2. As we have discussed, a substantial
number of low mass satellite galaxies are required to
have "young" stellar populations, but Fig. 3 shows that low
mass satellites typically become satellites quite early on.

Figure 4. The evolution of D_n4000/g - r as a function of time for different values of the star formation timescale parameter τ_c for
typical central galaxies. Solid lines are for solar metallicity and dashed lines show results for 0.25 solar metallicity. The coloured lines
show results for three different values of τ_c.



Table 1. Best-fit parameter values for star formation histories in four stellar mass bins as derived from the SDSS D_n4000 distributions and
galaxy correlation functions split by D_n4000.

log(M_*/M_⊙)  γ_c range    N_Gauss  median γ_c  τ_c (median, 16%, 84%)   <τ_s>  σ_τs   …      …
9.5-10        [-1.6,2.2]   77       0.025       (39.97, -4.589, 5.767)   2.312  0.992  4.214  3.126
10-10.5       [-1,1.2]     45       0.175       (5.724, 69.95, 2.106)    2.364  1.041  4.014  2.639
10.5-11       [-1,1]       41       0.318       (3.147, 6.900, 2.081)    2.491  0.534  5.267  4.239
11-11.5       [-0.8,1.2]   41       0.411       (2.435, 3.845, 1.953)    2.103  0.082  7.061  5.837


4 PARAMETRIZATION OF THE STAR FORMATION HISTORIES OF GALAXIES IN
THE MODEL

4.1 Computation of the SEDs 

Recent semi-analytic models (Croton et al. 2006;
Bower et al. 2006; Kang et al. 2006; Cattaneo et al.
2006) have attempted to resolve some of the difficulties
outlined in the previous section by including "AGN feed- 
back". The main effect of this form of feedback is to move 
the central galaxies of massive dark matter haloes to the 
red sequence. Our approach is different; rather than assume 
some process that suppresses star formation in massive 
galaxies, we parametrize the star formation histories of the 
galaxies in our simulation using simple functions, and we 
allow the parameters of our model to be constrained by the 
SDSS data. 

The formation time of each galaxy is defined as the time 
when the halo of the galaxy is first found in the simulation. 
We also know the infall times of all satellite galaxies, i.e. 
the times when they first became satellites. To model the 
star formation histories of our galaxies, we assume that the 
star formation rate of a galaxy declines exponentially with 
time after its formation and depends on the stellar mass (at 
redshift 0) of the galaxy. We also assume that if a galaxy 
is accreted and becomes a satellite, its star formation thereafter
declines with a different e-folding time τ_s(M_*). The
model therefore has two timescales: τ_c(M_*) for central galaxies
and τ_s(M_*) for satellite galaxies.

The star formation rate of a galaxy can thus be written:

\[
{\rm SFR}(t) \propto
\begin{cases}
e^{-t/\tau_c}, & \text{central galaxies} \\
e^{-t_{\rm central}/\tau_c}\, e^{-(t - t_{\rm central})/\tau_s}, & \text{satellite galaxies}
\end{cases}
\]

where the age of a galaxy t is calculated starting from its
formation time t_formation, which is assigned to a random
time between the simulation snapshot when the halo of the
galaxy was first found and the immediately preceding snapshot.
t_central = t_infall - t_formation is the time that the
galaxy spends as the central object of its own halo.
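
The piecewise exponential history can be sketched as a single function. This is an illustrative implementation under stated assumptions: the function and argument names are hypothetical, the overall normalisation is left at unity, and all times are in the same units (e.g. Gyr) measured from the formation time.

```python
import math


def time_as_central(t_infall, t_formation):
    """t_central = t_infall - t_formation, the time spent as a central galaxy."""
    return t_infall - t_formation


def sfr(t, tau_c, t_central=None, tau_s=None):
    """Relative star formation rate at age t (time since formation).

    tau_c     : e-folding time while the galaxy is a central
    t_central : time spent as a central before infall; None if still a central
    tau_s     : e-folding time after the galaxy becomes a satellite
    """
    if t_central is None or t <= t_central:
        return math.exp(-t / tau_c)
    # After infall: the central-phase decline is frozen at t_central and the
    # rate then decays with the satellite time scale.
    return math.exp(-t_central / tau_c) * math.exp(-(t - t_central) / tau_s)
```

The two branches agree at t = t_central, so the star formation rate is continuous across infall and only its rate of decline changes.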

The resulting spectral energy distributions (SEDs) of
the galaxies in the simulation are computed using the stellar
population synthesis model of BC03 (Bruzual & Charlot
2003), assuming a Chabrier (2003) IMF. Spectral properties
such as the 4000 Å break strength and colour depend on
the metallicity of the galaxy as well as on its star formation
history. Gallazzi et al. (2005) show that there exists a relation
between stellar metallicity and stellar mass for galaxies
in the local Universe. We use the mean relation derived in
their paper to specify the metallicity of the galaxies in our
simulation at a given value of M_*. Fig. 4 shows the evolution
of D_n4000/g - r with time for three different values of the γ_c
parameter (γ_c = 1/τ_c(Gyr)) for a typical central galaxy in our
model. Results are shown for solar metallicity BC03 models
(solid curves), as well as for 0.25 solar models (dashed
curves). As can be seen, there is a small but significant dependence
of D_n4000 and g - r on metallicity, particularly
for galaxies with short star formation time scales. Note also
that this computation of the spectral energy distribution neglects
the effects of dust on the light emitted by the galaxy.

In the following sections, we will concentrate on the 4000
Å break strength, because of its very weak dependence on
dust.

Figure 5. Our best fits to the D_n4000 distributions and the correlation functions split by D_n4000 in different stellar mass bins. In the
left panels, green lines are the full fits, while red/blue dashed lines show the contributions from satellite/central galaxies. In the middle
panels, red/blue lines are best-fit correlation functions for subsamples with larger/smaller values of D_n4000. The SDSS results are shown
in black. The right panels show the distributions of γ_c recovered by our non-parametric technique, with dashed lines showing the
contribution of each individual Gaussian.

Figure 7. The results of a full non-parametric fit to the distribution of star formation timescales of satellite and central galaxies in the
stellar mass range 10^10 - 10^10.5 M_⊙. In the right panel, the black curve shows the result of the non-parametric method, while the green
line is from the best-fit Gaussian of τ_s.



4.2 Fitting the Data 

For each galaxy in the Millennium Simulation, we have determined
t_formation and t_infall. We have also built a "library"
of predicted present-day D_n4000 values by running
the BC03 models for different combinations of galaxy metallicity
Z, galaxy lifetime, and the two star formation time
scales τ_c and τ_s. We interpolate over the grid of parameter
values stored in the library to obtain D_n4000 for all the
galaxies in the simulation. In addition, we know the (x,y,z)
positions of the galaxies within the simulation volume, and
their stellar masses have already been specified using the
model developed as part of our previous study (Wang et al.
2006). We therefore have all the necessary ingredients to
calculate the D_n4000 distributions as well as the correlation
functions split by D_n4000 for different ranges in stellar mass
and to compare the model predictions with the observations.
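
The library lookup described above amounts to interpolation on a regular parameter grid. The text does not say which interpolation scheme was used, so the sketch below assumes simple multilinear interpolation; the function names and the dictionary-based table layout are hypothetical, and no actual BC03 values are included.

```python
from bisect import bisect_right
from itertools import product


def _bracket(axis, x):
    """Index of the grid cell containing x along one axis, and the fractional offset."""
    i = min(max(bisect_right(axis, x) - 1, 0), len(axis) - 2)
    return i, (x - axis[i]) / (axis[i + 1] - axis[i])


def interpolate(axes, table, point):
    """Multilinear interpolation of a tabulated quantity (e.g. D_n4000).

    axes  : list of increasing grid coordinates, one list per parameter
            (e.g. metallicity, lifetime, tau_c, tau_s)
    table : dict mapping index tuples (i, j, ...) to the tabulated value
    point : parameter values at which to evaluate
    Points outside the grid are linearly extrapolated from the edge cell.
    """
    brackets = [_bracket(a, x) for a, x in zip(axes, point)]
    total = 0.0
    for corner in product((0, 1), repeat=len(axes)):
        weight = 1.0
        index = []
        for (i, w), c in zip(brackets, corner):
            weight *= w if c else 1.0 - w
            index.append(i + c)
        total += weight * table[tuple(index)]
    return total
```

Each galaxy then needs only its own parameter tuple and one call to `interpolate`, which avoids rerunning the population synthesis code for every object in the simulation.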

Initially we attempted to parametrize the distributions
of τ_c and τ_s using simple Gaussian functions. However, this
resulted in rather poor fits to the data. Further experimentation
indicated that it would be advantageous to switch to
a method that would allow the distributions of τ_c and τ_s
to take on complex shapes. Following the methodology outlined
by Blanton et al. (2003), the distribution of γ_c = 1/τ_c
is parameterized by a sum of many Gaussians with mean
values γ_k equally distributed in a given range:

\[
P(\gamma_c) = \sum_k \frac{N_k}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(\gamma_c - \gamma_k)^2}{2\sigma^2}\right] .
\]

We assume the same scatter σ for each Gaussian, but allow
the weighting factors N_k to vary. For each stellar mass
interval, the range and number of Gaussians used in this
non-parametric fitting technique are listed in Table 1. The
central values γ_k of each Gaussian are equally distributed
over the range with a step of 0.05. σ for each Gaussian is
fixed to be the same as this step width. This method allows
the distribution of γ_c values to take on any shape. This
approach turned out to be critical for central galaxies, but
resulted in little improvement when applied to the satellite
population (see Sec. 5.1). We therefore employ the non-parametric
method only for the central galaxies. For satellite
galaxies, a simple Gaussian centred at <τ_s> and with a
scatter of σ_τs is used to parametrize the distribution of τ_s.
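
A minimal sketch of this non-parametric description is given below. The step of 0.05 and the equal width come from the text; the function names are hypothetical, and normalising each Gaussian to unit area is an assumption about the convention.

```python
import math


def gamma_centres(lo, hi, step=0.05):
    """Equally spaced Gaussian centres gamma_k covering [lo, hi]."""
    n = int(round((hi - lo) / step)) + 1
    return [lo + k * step for k in range(n)]


def gamma_distribution(gamma, centres, weights, sigma=0.05):
    """Sum of equal-width Gaussians with free weights N_k, evaluated at gamma."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return sum(
        w * norm * math.exp(-0.5 * ((gamma - c) / sigma) ** 2)
        for c, w in zip(centres, weights)
    )
```

With a step of 0.05, the four ranges listed in Table 1 give 77, 45, 41 and 41 centres, matching the numbers of Gaussians quoted there. Only the weights are free parameters in the fit, so the shape of the distribution is not tied to any single functional form.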

During the fitting procedure, we also noticed that constraining
γ_c to be positive did not allow us to reproduce the
blue end of the D_n4000 distribution. We therefore allowed the
parameter γ_c to take on both positive and negative values.
In practice, a negative value of γ_c corresponds to a galaxy
that is experiencing an elevated level of star formation at
the present day (i.e. a starburst). For satellite galaxies, the
time scale τ_s is constrained to be positive, which is in any
case preferred by our fits.
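The role of the sign of γ_c can be made concrete with a small sketch of the star formation history described in this paper: an exponential with rate γ_c = 1/τ_c while the galaxy is a central, followed by an exponential decline with time scale τ_s after infall. The function name, argument names and the normalization to unity at t_form are assumptions made for illustration.

```python
import math

def relative_sfr(t, t_form, gamma_c, t_infall=None, tau_s=None):
    """Star formation rate at cosmic time t (Gyr), normalized to 1 at t_form.

    gamma_c > 0 gives a declining history, gamma_c < 0 a rising one.
    If t_infall is given, the rate declines with e-folding time tau_s afterwards.
    """
    if t < t_form:
        return 0.0
    if t_infall is None or t <= t_infall:
        return math.exp(-gamma_c * (t - t_form))
    rate_at_infall = math.exp(-gamma_c * (t_infall - t_form))
    return rate_at_infall * math.exp(-(t - t_infall) / tau_s)

declining = relative_sfr(13.0, 3.0, 0.45)    # central with positive gamma_c
rising = relative_sfr(13.0, 3.0, -0.1)       # central with negative gamma_c
satellite = relative_sfr(10.0, 2.0, 0.0, t_infall=7.5, tau_s=2.5)
```

With these illustrative numbers the negative-γ_c galaxy forms stars at a higher rate today than at t_form, which is the sense in which such objects are described as starbursts.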

To find the best fit to both the D_n4000 distribution and
the galaxy correlation functions split by D_n4000, we employ
the Levenberg-Marquardt algorithm, which interpolates between
the Gauss-Newton algorithm and the method of gradient
descent. The final results of the fitting procedure are
shown in Fig. 5. For each stellar mass bin, D_n4000
distributions and correlation functions for high/low D_n4000
subsamples are shown together with the distribution of γ_c that
produces the best fit. In the left panels, blue and red dashed
curves show results for central and satellite galaxies
respectively, while green lines are for both types of galaxy.
In the middle panels, red and blue curves refer to high and low
D_n4000 subsamples defined using the same technique that was
applied to the SDSS data. The distribution of γ_c that results
in these fits is shown in the right panels. Table 1 lists the
parameters of the best-fit models for four stellar mass bins.
For central galaxies, the median value of γ_c is listed, as
well as the timescale τ_c = 1/γ_c, its median value and its
16 and 84 percentile values. The table also lists <τ_s> and σ_τs,
the parameters that describe the star formation histories of
satellite galaxies. We estimate our parameters by minimizing
the quantity:

\[
\chi^2 = \frac{\chi^2_{\rm dis}}{N_{\rm dis}} + \frac{\chi^2_{\rm corr}}{N_{\rm corr}},
\]

\[
\chi^2_{\rm dis} = \sum_{N_{\rm dis}}
  \left[\frac{P - P_{\rm SDSS}}{\sigma(P_{\rm SDSS})}\right]^2 ,
\qquad
\chi^2_{\rm corr} = \sum_{N_{\rm corr}}
  \left[\frac{w(r_p) - w(r_p)_{\rm SDSS}}{\sigma(w(r_p)_{\rm SDSS})}\right]^2 ,
\]



where χ²_dis is evaluated for the D_n4000 distribution and
χ²_corr is evaluated for the projected correlation functions of
the high and the low D_n4000 subsamples. For each stellar
mass bin, N_dis is the number of points along the D_n4000
distribution that we fit. In practice, we adopt N_dis = 50
with D_n4000 in the range [0.5, 3]. N_corr is the number of
points on the correlation function used in the fit. We adopt
N_corr = 20 × 2 with r_p ranging from 0.113 to 8.972 h^-1 Mpc.
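A minimal sketch of such a figure of merit is given below. It assumes that each term is a sum of squared, error-normalized residuals and that the two terms are divided by the number of fitted points before being added; the function and argument names are hypothetical, and the numbers in the example are invented purely to exercise the code.

```python
def chi2_term(model, data, err):
    """Sum of squared residuals, each normalized by its observational error."""
    return sum(((m - d) / e) ** 2 for m, d, e in zip(model, data, err))

def combined_chi2(p_model, p_data, p_err, w_model, w_data, w_err):
    """Per-point chi-square of the distribution plus that of the clustering."""
    chi2_dis = chi2_term(p_model, p_data, p_err)
    chi2_corr = chi2_term(w_model, w_data, w_err)
    return chi2_dis / len(p_data) + chi2_corr / len(w_data)

# Invented toy numbers: two distribution points, four correlation points.
value = combined_chi2(
    p_model=[0.1, 0.2], p_data=[0.1, 0.3], p_err=[0.1, 0.1],
    w_model=[10.0, 5.0, 2.0, 1.0], w_data=[12.0, 5.0, 2.0, 1.0],
    w_err=[1.0, 1.0, 1.0, 1.0],
)
```

Dividing each term by its number of points keeps the 50-point distribution from being outweighed by, or outweighing, the 40 correlation function points simply because of the difference in sample size. A function of this kind can be passed to any least-squares minimizer, including a Levenberg-Marquardt implementation.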



5 THE RESULTS 

5.1 Discussion of the Results 

From Fig. 5 and Table 1, it is clear that the star formation
histories of low mass and high mass central galaxies are very
different. At large stellar masses (M_* > 10^10.5 M_⊙), nearly
all central galaxies have positive γ_c, and a large fraction of
them have ceased forming stars. There is a peak in the
distribution of γ_c at a value of around 0.45 Gyr^-1, which
corresponds to an e-folding timescale of ~2 Gyr. Together with
a minority of old and massive satellite galaxies, these central
galaxies in which star formation has shut down are necessary to
explain the strong peak in the D_n4000 distribution at values
of ~1.8, characteristic of metal-rich, evolved stellar
populations.

At low stellar masses, galaxies display a much wider variety
in star formation history. This is especially true in our
lowest stellar mass bin (9.5 < log(M_*/M_⊙) < 10). Nearly
half the galaxies in this mass range have negative γ_c. In
addition, there are also some objects that have "switched off",
i.e. the distribution of γ_c exhibits a tail toward large positive
values.

In the upper panel of Fig. 6 we show what happens
to our fit in this stellar mass range if γ_c is forced to take
on only positive values. The fitting procedure is exactly the
same as described in Sec. 4, but γ_k is constrained to lie in
the range [0.05, 2]. The result shows that the blue end of
the D_n4000 distribution is not well fit without a population
of "star-bursting" galaxies. In the BC03 model, a constant
star formation rate will result in a D_n4000 value of no less
than ~1.2 for a galaxy with an age of a few Gyr (see green
lines in Fig. 4). It is thus not possible to obtain low enough
values of D_n4000 to match the observations, unless we allow
for negative values of γ_c.

In the lower panel of Fig. 6 we test what happens if we
do not allow the distribution of γ_c to extend to very large
positive values. If we truncate the distribution at a value
of 0.5, we still find a fairly good fit with χ² = 4.836, which
is comparable to that of our best-fit model. This implies that
the existence of a long tail of galaxies with large positive
values of γ_c is not strongly preferred, i.e. our model is not
sensitive to the exact timescale over which the star formation
was truncated in these "old" central galaxies. One possibility
is that these galaxies correspond to a post-starburst phase in
which star formation has been temporarily reduced following
exhaustion of the gas or blow-out of a significant fraction of
the interstellar medium.

From the values of τ_s listed in Table 1, it can be seen
that, unlike central galaxies, satellite galaxies of all masses
have similar e-folding timescales, with an average value of
around 2-2.5 Gyr. This indicates that all satellite galaxies
experience a similar decline in star formation rate after
falling into a large structure. To test the robustness of this
conclusion, and to see if a Gaussian is sufficient to describe
the dispersion in τ_s values, we have carried out a full
non-parametric fit to constrain the shape of both the γ_c and the
γ_s (γ_s = 1/τ_s) distributions for galaxies in the stellar mass
range 10^10 to 10^10.5 M_⊙. The results are shown in Fig. 7,
with the panel on the far right showing the derived distribution
of γ_s. The distribution of γ_s obtained using the non-parametric
fitting method is somewhat different from the simple Gaussian
fit (indicated on the plot by the green curve). Nevertheless,
the resulting median value of τ_s is 2.257 Gyr, which is about
the same as that of the simple Gaussian fit. The small
difference in the distribution of τ_s between the two methods
makes very little difference to the fit or to the resulting χ²
values.

Figure 8. The distribution of specific star formation rates predicted by our model (red dashed lines) is compared with results derived
from the SDSS (black solid lines), in different stellar mass bins.

5.2 Consistency Checks 

So far, we have tuned the parameters τ_c and τ_s to reproduce
the D_n4000 distributions of SDSS galaxies as a function of
stellar mass and the projected correlation functions of high
D_n4000 and low D_n4000 galaxy subsamples. We have chosen
to focus on D_n4000 because it is relatively insensitive to the
effects of dust. However, D_n4000 is only one of many possible
age indicators, so it is interesting to check whether we obtain
consistent results for other measures of the star formation
history of a galaxy.

Brinchmann et al. (2004) computed specific star formation
rates (i.e. the star formation rate per unit stellar mass)
for SDSS galaxies. For star-forming galaxies, the values were
derived using emission line fluxes suitably corrected for the
effects of dust, so this indicator of recent star formation
history is independent of D_n4000. For galaxies with absent or
very weak emission lines, the specific star formation rates
were in fact estimated using the D_n4000 index, so this is no
longer an independent measure. Nevertheless it is interesting
to test whether the distribution of SFR/M_* predicted by
our best-fit model is in agreement with the results derived
directly from the SDSS data.



In Fig. 8 we show the results of this test. The red
dashed curves in the figure show the distributions of the
specific star formation rates predicted by our best-fit model in
four different bins of stellar mass. To calculate these values,
we simply integrate the star formation rate over the lifetime
of a galaxy to get its total stellar mass and we then divide
the star formation rate at the present day by this value.
The fraction of stellar mass that is returned to the
interstellar medium over the lifetime of the galaxy is taken to
be 0.5, which is the median value predicted by the BC03 model.
The black lines show the results obtained directly from our SDSS
sample, which have been corrected for volume incompleteness
with the 1/V_max weighting scheme (the same method
as for computing D_n4000 and g - r distributions described
in Sec. 2). From the plot we see that the agreement between
our model "predictions" and the observations is reasonably
good. The qualitative trends as a function of stellar mass
are reproduced quite well, but the absolute values of the
specific star formation rate predicted by the model tend to
be somewhat lower than in the data.
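For a single exponential star formation history the calculation described above has a closed form, which the following sketch evaluates. It is a simplified illustration only: it ignores the change of time scale at infall, and the function name and the example ages are assumptions. The returned fraction of 0.5 is the value quoted in the text.

```python
import math

def specific_sfr(gamma, age, returned_fraction=0.5):
    """Present-day SFR / M_* (in 1/Gyr) for SFR(t) proportional to exp(-gamma * t).

    gamma is in 1/Gyr and age is the time since star formation began, in Gyr.
    """
    sfr_now = math.exp(-gamma * age)
    if abs(gamma) < 1e-12:
        mass_formed = age  # constant star formation rate
    else:
        mass_formed = (1.0 - math.exp(-gamma * age)) / gamma
    surviving_mass = (1.0 - returned_fraction) * mass_formed
    return sfr_now / surviving_mass

constant = specific_sfr(0.0, 10.0)    # constant star formation for 10 Gyr
quenched = specific_sfr(0.45, 10.0)   # declining history
bursting = specific_sfr(-0.2, 10.0)   # rising history
```

The ordering bursting > constant > quenched shows how the fitted γ distribution maps directly onto the spread in specific star formation rate.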

We now test whether the same models allow us to repro- 
duce the colour distributions of galaxies. As we have noted, 
galaxy colours are quite sensitive to the effects of dust, so it 
would be surprising to us if this were the case! 

Fig. 9 shows the predicted g - r colour distributions in
the absence of dust and indeed, it appears that the model
does not fit very well. At low stellar masses, the model
produces too many extremely blue galaxies and not enough
galaxies in the red peak. This tells us that a substantial
number of galaxies in this red peak are actually star-forming
galaxies that are strongly reddened by dust. At high stellar
masses, the predicted shape of the g - r colour distribution
is similar to the observations, but there is an offset in the
predicted colours of galaxies in the red peak as compared to
the observations. The existence of an offset in g - r of around
0.1 mag for high stellar mass galaxies has been noted several
times in the literature on stellar population models (see
section 2.4.3 in Gallazzi et al. (2005) and the appendix of BC03
for more details about this colour mismatch for galaxies with
old stellar populations). One possibility is that the offset is
related to the effects of non-solar element abundance ratios
on the SEDs, something which is not currently accounted
for by the BC03 models.

Figure 10. The median star formation histories of satellite galaxies in different stellar mass bins. Black lines show results from the
semi-analytic models of De Lucia & Blaizot (2006), while red dashed lines are from our models. The SFR is normalized to its value at
the infall time. On the x-axis, the time is plotted relative to t_infall.

In summary, our results show that although very similar
qualitative results are obtained when using different spectral
indicators, there are non-negligible quantitative differences.
In particular, the effects of dust on galaxy colours must be
taken into account when interpreting colour distributions
and the clustering properties of galaxies split by colour. This
problem is not as serious for high mass galaxies, which are
mainly ellipticals with little gas, ongoing star formation or
dust. However, it is a major issue for low mass galaxies,
where the shapes of the distributions of D_n4000 and g - r
are quite different. As we have already seen from Fig. 1, the
relative fractions of high/low D_n4000 and red/blue galaxies
also differ substantially in our lowest mass bins.



5.3 Comparison with the Results of Semi-analytic 
Models 

As discussed in Sec. 1, the star formation histories of galaxies 
predicted by modern semi-analytic models ought to be sim- 
ilar to those in our model. In the SAMs, the star formation 
rates of galaxies also decline after they transition from cen- 
tral galaxies to satellite systems. In addition, modern SAMs 
also incorporate AGN feedback mechanisms that act to shut 
down star formation in the central galaxies of massive dark 
matter halos. It is interesting to investigate whether there is
"quantitative" agreement between our results and those of
the semi-analytic models.

In Fig. 10 we compare the median star formation history
of satellite galaxies in our model (red dashed lines)
with the median star formation history of satellites in
the semi-analytic galaxy catalogues of De Lucia & Blaizot
(2006) (black lines). Results are shown for four different
stellar mass bins. On the x-axis, we plot the time relative to
t_infall. On the y-axis, we plot the star formation rate
normalized to its value at t = t_infall. As can be seen, the main
difference between the two models lies in the behaviour of
the star formation after the galaxy is accreted as a satellite.
In the semi-analytic model, star formation declines much
more rapidly than in our model. In our model, the median
star formation e-folding time scale of satellites is around
2-2.5 Gyr, independent of galaxy stellar mass. In the
semi-analytic models, the e-folding time is closer to 1 Gyr.
The median star formation rates during the phase when the
galaxies are central objects are actually quite similar in the
two models, except for the most massive galaxies. In
the semi-analytic models, most of the massive galaxies appear
to have experienced a short period of very intense star
formation in their past, probably reflecting an early phase
of gas-rich merging.
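The practical size of this difference can be seen from a short worked example. It assumes a pure exponential decline after infall and uses the representative e-folding times quoted above (2.5 Gyr for our model, 1 Gyr for the semi-analytic model); the elapsed times and the function name are arbitrary choices for illustration.

```python
import math

def remaining_fraction(dt_gyr, tau_gyr):
    """Fraction of the infall star formation rate left after dt_gyr."""
    return math.exp(-dt_gyr / tau_gyr)

for dt in (1.0, 2.0, 4.0):
    ours = remaining_fraction(dt, 2.5)
    sam = remaining_fraction(dt, 1.0)
    print(f"{dt:.0f} Gyr after infall: tau=2.5 Gyr -> {ours:.2f}, tau=1 Gyr -> {sam:.2f}")
```

Two Gyr after infall, a satellite with an e-folding time of 2.5 Gyr still forms stars at about 45 per cent of its infall rate, compared with about 14 per cent for an e-folding time of 1 Gyr.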



5.4 Evolution to Higher Redshifts 

As we showed in the previous section, there are significant
differences in the star formation histories of satellite
galaxies in our model as compared to the semi-analytic model
of De Lucia & Blaizot (2006). As a result, the colours and
spectral properties of satellite galaxies in the two models
will differ substantially at high redshifts, because the time
between t_infall and t_observation will be considerably smaller
than at the present day. If the star formation e-folding
timescale of satellite galaxies is long, one would expect many
of these systems to be still blue and actively star forming at
higher redshifts. Recent work by Cucciati et al. (2006) and
Cooper et al. (2006) has found that the steep colour-density
relation observed at low redshift becomes weaker and
gradually disappears by a redshift of ~1.5.

In Fig. 11 we plot the relations between D_n4000 and
local density in our model (upper panels) and in the
semi-analytic model (lower panels). Results are shown at 5
different redshifts and for two different ranges in the quantity
M_infall. In the semi-analytic model, D_n4000 is calculated
using the BC03 model (see footnote 2). The local density is
evaluated by counting the number of galaxies within a sphere of
radius 2 Mpc centred on each galaxy and dividing by the
corresponding number in randomly distributed spheres of the same
size. We only include galaxies with M_infall greater than
2 x 10 h~ 1 MQ when calculating these densities in order to 
make sure that our results are not affected by the resolution
limit of the simulation. Lines with different colours show the
median D_n4000 for galaxies as a function of density at
redshifts z = 0, 0.3, 0.8, 1.5, 2, 3. In both models, the
D_n4000-density relations are shallower for galaxies with higher
values of M_infall, reflecting the fact that the star formation
histories for satellite galaxies are rather similar at high masses.
However, the evolution with redshift differs substantially in
the two models. In our models, the D_n4000-density relations
become much flatter at higher redshifts and they basically
disappear by a redshift of ~ 1. In contrast, the relations in
the semi-analytic models remain steep up to redshifts ~ 3.
It is interesting that for galaxies with M_infall > 10^12.5 M_⊙,
the trend in colour as a function of density exhibits a
non-monotonic behaviour at higher redshifts in the SAM.

2 Note that this index is not currently available in the publicly
released galaxy catalogues.
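The local density estimate used for these relations (counts within a 2 Mpc sphere, divided by the count expected for randomly placed spheres) can be sketched as follows. Several choices here are assumptions made for illustration rather than details taken from the paper: the expected count is computed analytically from the mean number density instead of from random spheres, the central galaxy is excluded from its own count, the box is treated as periodic, and a brute-force pair loop is used, which is only practical for small samples.

```python
import math

def periodic_delta(a, b, box_size):
    """Shortest separation along one axis in a periodic box."""
    d = abs(a - b)
    return min(d, box_size - d)

def local_overdensity(positions, box_size, radius=2.0):
    """Neighbour count within `radius` of each galaxy over the mean expected count."""
    n = len(positions)
    sphere_volume = 4.0 / 3.0 * math.pi * radius ** 3
    expected = n * sphere_volume / box_size ** 3
    r2 = radius * radius
    result = []
    for i, (xi, yi, zi) in enumerate(positions):
        count = 0
        for j, (xj, yj, zj) in enumerate(positions):
            if i == j:
                continue  # assumption: a galaxy does not count itself
            dx = periodic_delta(xi, xj, box_size)
            dy = periodic_delta(yi, yj, box_size)
            dz = periodic_delta(zi, zj, box_size)
            if dx * dx + dy * dy + dz * dz <= r2:
                count += 1
        result.append(count / expected)
    return result

# Invented toy catalogue: a tight group of three plus one isolated galaxy.
toy_positions = [(1.0, 1.0, 1.0), (1.5, 1.0, 1.0), (1.0, 2.0, 1.0), (10.0, 10.0, 10.0)]
densities = local_overdensity(toy_positions, box_size=20.0)
```

For a catalogue the size of a cosmological simulation, the pair loop would be replaced by a grid or tree search, but the normalization by the mean expected count is unchanged.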

The main reason why the colour-density relation of low
mass galaxies remains steep out to high redshifts in the
semi-analytic models is that τ_s is short. As we noted in the
previous subsection, at redshift zero τ_s is ~ 1 Gyr in the
SAMs as compared to ~ 2.5 Gyr for our best-fit model.
In SAMs, the star formation rate is assumed to be inversely
proportional to the dynamical time of the galaxy and hence,
for a given value of M_infall, τ_s will be smaller at higher
redshifts. In our model, we have assumed that τ_s is independent
of redshift.



6 SUMMARY AND CONCLUSIONS 

We have extended the physically-based halo occupation
distribution (HOD) models of Wang et al. (2006), which relate
the stellar mass of the galaxy to the mass of its halo at the
time it was last a central dominant object. The models
presented in this paper consider the dependence of clustering
on the spectral energy distributions (SEDs) of galaxies. The
star formation history of a galaxy is assumed to depend on
stellar mass and on the time at which the galaxy transitions
from being a central galaxy to a satellite galaxy orbiting
within a larger structure. The stellar population synthesis
model of Bruzual & Charlot (2003) is used to compute the
spectral energy distributions of the galaxies in the model.
Rather than colour, we focus on the spectral index D_n4000
because of its weak dependence on dust. By fitting both the
bimodal distribution of D_n4000 and the projected correlation
functions for high/low D_n4000 subsamples for SDSS
galaxies in 4 different stellar mass ranges, we constrain the
star formation e-folding time of central and satellite galaxies
as a function of stellar mass at z = 0.

Our results show that at high stellar masses, a large
fraction of central galaxies have ceased forming stars. This
shutdown in star formation is necessary to explain the fact
that the majority of massive galaxies are red and old. It is
also consistent with the conclusions of Croton et al. (2006),
Bower et al. (2006) and Cattaneo et al. (2006), who show that
a shut-down of gas cooling and star formation in massive
dark matter halos results in a much better fit to present-day
galaxy luminosity functions and colour distributions. In
these models, radio AGN feedback is invoked as the mechanism
responsible for this suppression of cooling.

For low stellar masses, central galaxies display a wide
range of different star formation histories. A significant
fraction of low mass central galaxies are experiencing starbursts.
We also find a "tail" of low mass central galaxies in which
star formation is currently suppressed, but our model is not
sensitive to exactly when this suppression occurred. In a
recent paper, Kauffmann et al. (2006) studied the star
formation histories of local galaxies by analysing the scatter
in their colours and spectral properties, and concluded that
star formation occurs in shorter, higher amplitude events in
smaller galaxies. It is interesting that we come to very
similar conclusions using a completely different method.
Compared to central galaxies, satellites have a much narrower
distribution of star formation e-folding timescales. In
satellites, the average e-folding time does not depend on stellar
mass and has a value of around 2-2.5 Gyr.

We have also checked whether our best-fit model pa- 
rameters derived from the distributions and clustering of 
galaxies as a function of D_n4000 can be extended to
understanding the distribution of galaxy colours. Our conclu-
sion is that the effects of dust must be taken into account 
when modelling colours. This is especially true of low mass 
galaxies, which can be actively star-forming and yet contain 
sufficient dust to match the colours of high mass ellipticals. 

Finally, we have compared the star formation histories
of galaxies in our model with the semi-analytic models of
galaxy formation of De Lucia & Blaizot (2007). The main
difference between the two approaches is in the timescale
over which star formation declines in satellite galaxies once
they are accreted by larger structures. In the semi-analytic
models, star formation in satellites decreases with an
e-folding time of about 1 Gyr. In our models, the decrease
occurs on a timescale that is a factor of 2-3 longer.
This leads to a conflict between the colours of satellites in
the semi-analytic models and in the SDSS data. Similar
conclusions have been reached by Weinmann et al. (2006), who show
that the semi-analytic models of Croton et al. (2006) predict a
blue fraction of satellites that is too low to match results from
their catalogue of galaxy groups extracted from the SDSS.

Assuming our derived star formation parameters to apply
at all times, we predict a weakening of the colour-density
relation towards higher redshifts and a complete
disappearance of this relation at a redshift of about 1.5. This
is consistent with recent work by Cucciati et al. (2006) and
Cooper et al. (2006). In contrast, a strong colour-density
relation is maintained in the De Lucia & Blaizot (2007)
semi-analytic model up to redshifts greater than 3.